Modeling Transcription Factor Binding Sites with Supervised Learning
Abstract
We present a supervised learning approach to modeling transcription factor binding sites in four distinct species. Using the consensus scoring method, we examine binding sites of unequal length and the alignment strategy they require. Pairwise scoring and information content were added to the consensus scoring in an attempt to further increase the accuracy of transcription factor binding site detection. Information content did not improve accuracy, whereas pairwise scoring regularly did. These results further support the concept of a core region contained within the binding sites.
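The abstract does not give the scoring formulas, so the following is only an illustrative sketch of the three quantities it names, not the authors' implementation. It assumes the training sites have already been aligned to equal length (the alignment strategy for unequal-length sites is not reproduced here), a uniform background of 0.25 per base, and adjacent-position pairs as one simple form of pairwise scoring. All function names are hypothetical.

```python
import math
from collections import Counter


def position_counts(sites):
    """Count bases in each column of pre-aligned, equal-length sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("sites must be aligned to equal length")
    return [Counter(s[i] for s in sites) for i in range(length)]


def consensus_score(counts, candidate):
    """Average fraction of training sites that match the candidate per position."""
    n_sites = sum(counts[0].values())
    matches = sum(counts[i][base] for i, base in enumerate(candidate))
    return matches / (n_sites * len(counts))


def information_content(counts, background=0.25):
    """Per-position information content in bits (assumed uniform background)."""
    result = []
    for column in counts:
        n = sum(column.values())
        result.append(
            sum((c / n) * math.log2((c / n) / background)
                for c in column.values() if c > 0)
        )
    return result


def pairwise_counts(sites):
    """Count adjacent base pairs at each pair of neighboring columns."""
    return [Counter(s[i:i + 2] for s in sites) for i in range(len(sites[0]) - 1)]


def pairwise_score(pair_counts, candidate, n_sites):
    """Average fraction of training sites sharing each adjacent pair with the candidate."""
    matches = sum(pair_counts[i][candidate[i:i + 2]] for i in range(len(pair_counts)))
    return matches / (n_sites * len(pair_counts))
```

A candidate sequence would be scored against counts built from known sites, for example `consensus_score(position_counts(sites), candidate)`. How the consensus, pairwise, and information-content terms are combined into one classifier score is a design choice the abstract leaves unspecified.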
Related resources
Analysis of Pairwise Dependency Information Content for Representing and Searching for Transcription Factor Binding Sites
Transcription factors are proteins that are able to bind to certain segments of DNA to control gene expression. We present an improvement upon supervised learning approaches used for finding transcription factor binding sites. We look at binding sites of the same length for a single transcription factor and use the Berg and von Hippel scoring method. Pairwise information content of positional d...
Genome annotation test with validation on transcription start site and ChIP-Seq for Pol-II binding data
Motivation: Many ChIP-Seq experiments are aimed at developing gold standards for determining the locations of various genomic features, such as transcription start sites or transcription factor binding sites, on the whole genome. Many such pioneering experiments lack rigorous testing methods and adequate 'gold standard' annotations to compare against as they themselves are the most reliable source of em...
A Parametric Joint Model of DNA-Protein Binding, Gene Expression and DNA Sequence Data to Detect Target Genes of a Transcription Factor
This paper concerns predicting the regulatory targets of a transcription factor (TF). We propose and study a joint model that combines DNA-protein binding, gene expression and DNA sequence data simultaneously; a parametric mixture model is used to realize unsupervised learning, which can, however, also be extended to semi-supervised learning. We applied the methods to an E. coli dat...
Discovering Transcriptional Regulatory Rules from Gene Expression and TF-DNA Binding Data by Decision Tree Learning
Background: One of the most promising but challenging tasks in the post-genomic era is to reconstruct the transcriptional regulatory networks. The goal is to reveal, for each gene that responds to a certain biological event, which transcription factors affect its transcription, and how several transcription factors coordinate to accomplish specific regulations. Results: Here we propose a supervi...
Mapping of Transcription Factor Binding Region of Kappa Casein (CSN3) Gene in Iranian Bacterianus and Dromedaries Camels
κ-casein is a glycosylated protein in mammalian milk that plays an essential role in the milk micelles. Control of κ-casein expression reflects this essential role, although an understanding of the mechanisms involved lags behind that of the other milk protein genes. Transcriptional regulation, a first mechanism for controlling the development of organisms, is carried out by transcription facto...